Skip to content

fix(interface): silence registry-default deprecation when library auto-fills it#655

Merged
nabinchha merged 1 commit into
mainfrom
fix/registry-default-warning-589
May 13, 2026
Merged

fix(interface): silence registry-default deprecation when library auto-fills it#655
nabinchha merged 1 commit into
mainfrom
fix/registry-default-warning-589

Conversation

@nabinchha
Copy link
Copy Markdown
Contributor

📋 Summary

Fix for the spurious ModelProviderRegistry.default is deprecated warning that fires for every fresh-install DataDesigner() construction, even when the user wrote default= nowhere — neither in YAML, nor in Python, nor in any ModelConfig. Reported by @johnny-greco in Slack while testing the GitHub plugin and the tutorials.

The reporter's snippet — pure DataDesignerConfigBuilder + a UUID sampler + bare DataDesigner().preview(...), with zero ModelConfigs anywhere — surfaces the warning attributed to the DataDesigner() line, with a remediation message ("Specify provider= explicitly on each ModelConfig") that doesn't apply.

🔗 Related Issue

Refs #589, follow-up to #594.

🔍 Root cause

resolve_model_provider_registry synthesises default=providers[0].name for the multi-provider case to satisfy check_implicit_default. The auto-seeded ~/.data-designer/model_providers.yaml ships three providers and no default: key, so this path is hit on every bare DataDesigner() call. _warn_on_explicit_default then attributes the warning to the user's DataDesigner() line — but the user never opted into the deprecated registry-level default; the library filled it in.

The existing suppression in DataDesigner.__init__ only covered the YAML-default-fallback case (where get_default_provider_name() already emitted the YAML deprecation). The library-auto-fill case was unsuppressed.

🔄 Changes

🔧 Changed

  • DataDesigner.__init__ — broaden the existing warnings.catch_warnings() suppression so it also covers model_providers is None (the "we're about to load the YAML and synthesise a default" case). Updated the comment to spell out both branches.
  • Users who hand-construct a multi-provider list in Python still see the warning (they wrote the multi-provider intent themselves); direct ModelProviderRegistry(default="x") always warns. Those are the entry points refactor: deprecate implicit default model provider routing #589 actually targets.

✨ Added

  • test_init_no_user_providers_no_yaml_default_stays_quiet — pins the bare-DataDesigner() quiet path against multi-provider YAML with no default: key, plus a behavioral pin that first-wins still resolves correctly. A future tightening of the suppression can't silently re-introduce the spurious warning.

🧪 Behavior matrix

Scenario Before After
Bare DataDesigner() (Slack repro + tutorial line 46) warns quiet
DataDesigner(model_providers=[single]) quiet quiet
DataDesigner(model_providers=[multi]) user-supplied warns warns (intentional — user wrote multi-provider intent)
Direct ModelProviderRegistry(default="x") warns warns (#589 entry point)
YAML default: key set YAML warning only YAML warning only

✅ Checklist

  • make check-all-fix clean
  • Targeted: 5/5 default-provider tests pass, including new regression
  • Broader: 118/118 across test_data_designer.py, test_model_provider.py, test_models.py
  • Verified all user-facing examples (tutorials, recipes, fern, devnotes, scripts) already pass provider= explicitly — PR feat(models): deprecate implicit default provider routing #594 covered them, no further example updates needed
  • Commits signed off (DCO)

…o-fills it

The ``ModelProviderRegistry.default is deprecated`` warning added in #594
fires for every fresh-install ``DataDesigner()`` construction, even when
the user wrote ``default=`` nowhere — neither in YAML, nor in Python, nor
in any ``ModelConfig``.

Root cause: ``resolve_model_provider_registry`` synthesises
``default=providers[0].name`` for the multi-provider case to satisfy
``check_implicit_default``. The auto-seeded
``~/.data-designer/model_providers.yaml`` ships three providers and no
``default:`` key, so this path is hit on every bare ``DataDesigner()``
call. ``_warn_on_explicit_default`` then attributes the warning to the
user's ``DataDesigner()`` line, with a remediation message ("Specify
provider= explicitly on each ModelConfig") that doesn't even apply when
the user hasn't built a ``ModelConfig`` (e.g. a UUID-only sampler config
with the GitHub plugin).

Fix: broaden the existing warning suppression in ``DataDesigner.__init__``
to also cover the ``model_providers is None`` case. The user is opting
into all defaults — the library is the one filling ``default=``, so the
deprecation nudge is misdirected. Users who hand-construct a
multi-provider list in Python still see the warning (they wrote the
multi-provider intent themselves), and direct
``ModelProviderRegistry(default="x")`` always warns — those are the
entry points #589 actually targets.

New regression test pins the bare-``DataDesigner()`` quiet path so a
future tightening of the suppression can't silently re-introduce the
spurious warning.

Refs #589, follow-up to #594.

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
@nabinchha nabinchha requested a review from a team as a code owner May 13, 2026 20:02
@github-actions
Copy link
Copy Markdown
Contributor

Summary

PR #655 fixes a spurious ModelProviderRegistry.default is deprecated DeprecationWarning that fires on every fresh-install bare DataDesigner() call. The fresh-install ~/.data-designer/model_providers.yaml ships three providers and no default: key, so resolve_model_provider_registry synthesises default=providers[0].name to satisfy check_implicit_default — and _warn_on_explicit_default then attributes a deprecation nudge to a user who never opted in.

The fix broadens the existing warnings.catch_warnings() suppression in DataDesigner.__init__ so it covers both library-synthesised default cases:

  1. model_providers is None (caller passed nothing, library auto-fills from YAML)
  2. default_provider_name is not None (YAML had a default: key — pre-existing case, already covered)

A new regression test (test_init_no_user_providers_no_yaml_default_stays_quiet) pins the bare-init quiet path against multi-provider YAML with no default: key, plus a behavioural assertion on first-wins resolution.

Findings

Correctness — looks good

  • The library_synthesised_default = model_providers is None or default_provider_name is not None condition correctly captures both library-fill cases without leaking into user-driven entry points (model_providers=[multi] supplied directly, direct ModelProviderRegistry(default="x") construction). The behaviour matrix in the PR description matches the code.
  • The pre-existing branch condition default_provider_name is not None is still subsumed — when model_providers is None and the YAML carries a default, both clauses are True (harmless OR), and get_default_provider_name()'s YAML deprecation still fires before the registry build (outside the suppression scope), so users still see the actionable YAML warning. Verified in test_init_yaml_default_emits_single_deprecation_warning (unchanged).
  • get_default_provider_name() is called inside if model_providers is None: (line 164) — outside the catch_warnings() block, so the YAML-level warning remains visible in case 2. Suppression is correctly scoped to only the registry construction.

Code quality

  • The 23-line block comment (lines 169–190) is dense but warranted — this is the second deprecation-suppression iteration in this exact spot (PR feat(models): deprecate implicit default provider routing #594 → PR fix(interface): silence registry-default deprecation when library auto-fills it #655) and the rationale is non-obvious. The comment explicitly names which entry points refactor: deprecate implicit default model provider routing #589 still targets, which is the right discipline given the inverted "we suppress to help the user" semantics. Suggestion (optional): the trailing two paragraphs (about hand-constructed multi-provider lists and direct ModelProviderRegistry(default=...)) duplicate the PR description's behaviour matrix; if length is a concern, this could be trimmed. But I'd lean toward keeping it — the next person who touches this code will thank you.
  • library_synthesised_default is a one-shot local — could be inlined into the if. The named binding does aid readability and matches the comment's framing, so I'd keep it as is.
  • Naming nit: British spelling "synthesised" appears in both code and comments. The codebase uses both spellings (e.g., "synthesise" in this file's older comments), so this is consistent — no change needed.

Test coverage — strong

  • New test patches both get_default_providers and get_default_provider_name to simulate the fresh-install YAML shape (multi-provider, no default: → returns None). Correct simulation of the production code path.
  • Asserts both the absence of the registry-level deprecation and that first-wins resolution still works (get_default_provider_name() == "yaml-first"). The behavioural pin guards against a future "fix" that suppresses the warning by hard-coding default=None, which would break first-wins.
  • Docstring explicitly cross-references the counterpart test (test_init_user_supplied_providers_preserve_first_wins_over_yaml_default) — good navigability for the next maintainer.
  • warnings.simplefilter("always", DeprecationWarning) is correctly used to defeat any inherited filter state, matching the convention of the sibling test at line 625.

Risks / blast radius — minimal

  • Change is one boolean expression broadened from default_provider_name is not None to model_providers is None or default_provider_name is not None. No new code paths; suppression is still scoped to resolve_model_provider_registry(...) and only filters the exact deprecation message string.
  • No public API surface changes; no behavioural change for callers who already pass model_providers= explicitly.
  • Backwards-compatible — the suppression is strictly broader, so previously-quiet paths remain quiet.

Security — n/a

No auth, secret-handling, input-validation, or network surface touched. The test stubs use fake endpoints and PlaintextResolver() as the codebase convention.

Performance — n/a

No hot-path or import-time impact; the only added work is one boolean evaluation in __init__.

Minor observations (non-blocking)

  • The comment block ends with "those are the entry points refactor: deprecate implicit default model provider routing #589 targets" — this is great context. Consider whether a one-line # See PR #655 for the broadened suppression reference might help future archaeology, but the in-code reference to # See PR #594 review already establishes the pattern.
  • The behaviour matrix in the PR description is excellent and should arguably live alongside the code as a docstring or comment table — but that's a larger ergonomic improvement and out of scope here.

Verdict

Approve / LGTM (non-binding — automated review).

This is a tight, well-scoped fix for a high-visibility paper-cut (every fresh-install user hits this). The code change is one expression; the comment explaining why is appropriately defensive given this is the second iteration in this spot. The new regression test pins both the warning suppression and the first-wins behavioural contract that motivated #588, so the next refactor can't silently regress either property. The PR description, behaviour matrix, and test cross-references are exemplary.

No changes requested.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 13, 2026

Greptile Summary

This PR fixes a spurious ModelProviderRegistry.default is deprecated warning that fired on every bare DataDesigner() construction when the default YAML ships multiple providers and no default: key — a case where resolve_model_provider_registry auto-synthesises the default on the user's behalf.

  • data_designer.py: Broadens the existing warnings.catch_warnings() suppression from default_provider_name is not None to model_providers is None or default_provider_name is not None, silencing the registry-level warning whenever the library, not the user, is responsible for filling in default=.
  • test_data_designer.py: Adds test_init_no_user_providers_no_yaml_default_stays_quiet to pin the quiet path — multi-provider YAML with no default: key — and verify that first-wins resolution still works correctly.

Confidence Score: 5/5

Safe to merge — the change is a minimal, well-scoped widening of an existing warning suppression, and a new regression test pins the exact scenario being fixed.

The suppression is tightly scoped to the single ModelProviderRegistry.default is deprecated message and only activates on the model_providers is None path, leaving user-supplied multi-provider lists unaffected. The behavior matrix in the PR description matches the code, the new test correctly exercises the previously-unsuppressed path, and the existing YAML-default test continues to pass. No logic is removed; only the guard condition is broadened.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer/src/data_designer/interface/data_designer.py Widens the warning suppression guard to cover the model_providers is None path; logic is correct and targeted to only the ModelProviderRegistry.default is deprecated message.
packages/data-designer/tests/interface/test_data_designer.py Adds a well-structured regression test that patches both YAML helpers, asserts no spurious deprecation warning, and pins first-wins provider resolution.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["DataDesigner.__init__(model_providers=?)"] --> B{model_providers is None?}
    B -- Yes --> C["_resolve_model_providers(None)\n→ calls get_default_providers()"]
    B -- No --> D["_resolve_model_providers(model_providers)\ndefault_provider_name = None"]
    C --> E["get_default_provider_name()"]
    E -- "returns None\n(no YAML default:)" --> F["default_provider_name = None"]
    E -- "returns 'name'\n(YAML default: key set)" --> G["default_provider_name = 'name'\n(YAML DeprecationWarning already fired)"]
    F --> H["library_synthesised_default = True\n(model_providers is None)"]
    G --> H
    D --> I["library_synthesised_default = False"]
    H --> J["warnings.filterwarnings ignore\n'ModelProviderRegistry.default is deprecated'"]
    J --> K["resolve_model_provider_registry(providers, default_provider_name)"]
    I --> K
    K --> L{Multi-provider\n& no explicit default?}
    L -- Yes --> M["synthesise default=providers[0].name\n_warn_on_explicit_default fires"]
    L -- No --> N["Registry built normally"]
    M -- suppressed if library_synthesised_default --> O["✅ Quiet (library filled it in)"]
    M -- not suppressed if user-supplied --> P["⚠️ DeprecationWarning visible to user"]
    N --> Q["✅ Registry ready"]
Loading

Reviews (1): Last reviewed commit: "fix(interface): silence registry-default..." | Re-trigger Greptile

Copy link
Copy Markdown
Contributor

@johnnygreco johnnygreco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this one, @nabinchha, this is a tight fix with the right regression coverage.

Summary

This PR suppresses the registry-level default-provider deprecation warning only when DataDesigner itself is filling the registry default from YAML-backed construction, while preserving warnings for user-supplied multi-provider lists and direct ModelProviderRegistry(default=...) usage. The implementation matches the stated intent in the PR description.

Findings

No findings.

What Looks Good

  • The suppression is scoped narrowly around resolve_model_provider_registry, so unrelated deprecation warnings still surface normally.
  • The new regression test captures both parts of the desired behavior: no spurious warning for bare DataDesigner() and first-provider fallback still resolving correctly.
  • Existing tests already pin the complementary user-supplied and YAML-default: warning paths, which makes this small behavior change feel well boxed in.

Verdict

Ship it: ready to merge as-is.


This review was generated by an AI assistant.

@nabinchha nabinchha merged commit 3c8394e into main May 13, 2026
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants